A Greedy Algorithm for Aligning DNA Sequences
Authors
Abstract
For aligning DNA sequences that differ only by sequencing errors, or by equivalent errors from other sources, a greedy algorithm can be much faster than traditional dynamic programming approaches and yet produce an alignment that is guaranteed to be theoretically optimal. We introduce a new greedy alignment algorithm with particularly good performance and show that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data. An implementation of this algorithm is currently used in a program that assembles the UniGene database at the National Center for Biotechnology Information.
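The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes unit-cost edit distance as the score and uses the classic greedy "furthest-reaching diagonal" scheme: for each number of differences d, record how far each diagonal can be extended through matches. The function names `greedy_edit_distance` and `dp_edit_distance` are hypothetical, chosen for illustration. The dynamic programming version is included to show that both compute the same value.

```python
def greedy_edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via greedy extension along diagonals.

    Illustrative sketch only; not the algorithm from the paper.
    Diagonal k holds the cells (i, j) with j - i == k.
    """
    m, n = len(a), len(b)

    def slide(i: int, k: int) -> int:
        # Follow matching characters along diagonal k at no cost.
        while i < m and i + k < n and a[i] == b[i + k]:
            i += 1
        return i

    goal = n - m                 # diagonal that contains the end cell (m, n)
    reach = {0: slide(0, 0)}     # reach[k]: furthest row on diagonal k so far
    d = 0
    while reach.get(goal, -1) < m:
        d += 1
        nxt = {}
        for k in range(max(-d, -m), min(d, n) + 1):
            i = max(
                reach.get(k, -2) + 1,      # mismatch on the same diagonal
                reach.get(k + 1, -2) + 1,  # a[i] left unmatched
                reach.get(k - 1, -1),      # b[j] left unmatched
            )
            if i < 0:
                continue                   # no predecessor on any neighbor
            i = min(i, m, n - k)           # stay inside the alignment grid
            nxt[k] = slide(i, k)
        reach = nxt
    return d


def dp_edit_distance(a: str, b: str) -> int:
    """Reference quadratic-time dynamic programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j - 1] + (ca != cb), prev[j] + 1, cur[j - 1] + 1))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    x, y = "ACGTACGTAC", "ACGTTCGAC"
    print(greedy_edit_distance(x, y), dp_edit_distance(x, y))
```

When the two sequences differ in only a few positions, the greedy loop visits only a narrow band of diagonals, whereas the dynamic programming version always fills the full table. That is the kind of input the abstract describes as favorable.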
Similar resources
Identifying DNA and protein patterns with statistically significant alignments of multiple sequences
MOTIVATION: Molecular biologists frequently can obtain interesting insight by aligning a set of related DNA, RNA or protein sequences. Such alignments can be used to determine either evolutionary or functional relationships. Our interest is in identifying functional relationships. Unless the sequences are very similar, it is necessary to have a specific strategy for measuring, or scoring, the rela...
A new greedy randomised adaptive search procedure for multiple sequence alignment
The Multiple Sequence Alignment (MSA) is one of the most challenging tasks in bioinformatics. It consists of aligning several sequences to show the fundamental relationship and the common characteristics between a set of protein or nucleic sequences; this problem has been shown to be NP-complete if the number of sequences is >2. In this paper, a new incomplete algorithm based on a Greedy Random...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
An Iterated Greedy Algorithm for Solving the Blocking Flow Shop Scheduling Problem with Total Flow Time Criteria
In this paper, we propose an iterated greedy algorithm for solving the blocking flow shop scheduling problem with a total flow time minimization objective. The steps of this algorithm are designed to be very efficient. For generating an initial solution, we develop an efficient constructive heuristic by modifying the best known NEH algorithm. Effectiveness of the proposed iterated greedy algorithm is t...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Journal title:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 7, Issue 1-2
Pages: -
Publication date: 2000